Forest Management Study

Clark, C. M., Richkus, J., Jones, P. W., Phelan, J., Burns, D. A., de Vries, W., Du, E., Fenn, M. E., Jones, L., & Watmough, S. A. (2019). A synthesis of ecosystem management strategies for forests in the face of chronic nitrogen deposition. Environmental Pollution, 248, 1046–1058. https://doi-org.proxy.lib.miamioh.edu/10.1016/j.envpol.2019.02.006

In [1]:
# You need to run this block of code to import the libraries that are needed for the data manipulations, visualizations, etc. 

from pydoc import help  # can type in the python console `help(name of function)` to get the documentation
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.preprocessing import scale
from sklearn.decomposition import PCA
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from scipy import stats
from IPython.display import display, HTML

Import data, data prep, and EDA

In [2]:
# Read csv file into pandas dataframe

forest = pd.read_csv("ForestData.csv")
In [3]:
# Examine first few cases
forest.head()
Out[3]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Magnitude.or.degree.of.change Magnitude.or.degree.of.change..text. Response..Measurement Response.Number.of.Observations Response.Standard.Deviation Control.Reference.Measurement Control.Number.of.Observations Control.Standard.Deviation general.area Location
0 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 N 0.05 community composition Ground cover ... -30.77% Decrease 0.9 84.0 0.9 1.3 56.0 0.5 PNW Oregon
1 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 N 0.05 community composition Ground cover ... -15.38% Decrease 1.1 72.0 0.2 1.3 56.0 0.5 PNW Oregon
2 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Ground cover ... -53.85% Decrease 0.6 24.0 0.9 1.3 56.0 0.5 PNW Oregon
3 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Ground cover ... -7.69% Decrease 1.2 28.0 0.9 1.3 56.0 0.5 PNW Oregon
4 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 N 0.05 community composition Ground cover ... -15.38% Decrease 1.1 16.0 0.9 1.3 56.0 0.5 PNW Oregon

5 rows × 23 columns

In [4]:
# List the columns
list(forest.columns)
Out[4]:
['Reference',
 'Habitat.Type',
 'Management.Method',
 'Management.Intensity',
 'Duration.of.treatment..years.',
 'Years.since.Treatment..if.applicable.',
 'Significant.',
 'Alpha',
 'Management.Effect.Type',
 'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
 'Endpoint',
 'Measurement.Unit',
 'Endpoint.Modifier',
 'Magnitude.or.degree.of.change',
 'Magnitude.or.degree.of.change..text.',
 'Response..Measurement',
 'Response.Number.of.Observations',
 'Response.Standard.Deviation',
 'Control.Reference.Measurement',
 'Control.Number.of.Observations',
 'Control.Standard.Deviation',
 'general.area',
 'Location']
In [5]:
forest.dtypes
Out[5]:
Reference                                                       object
Habitat.Type                                                    object
Management.Method                                               object
Management.Intensity                                            object
Duration.of.treatment..years.                                   object
Years.since.Treatment..if.applicable.                           object
Significant.                                                    object
Alpha                                                           object
Management.Effect.Type                                          object
Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.     object
Endpoint                                                        object
Measurement.Unit                                                object
Endpoint.Modifier                                               object
Magnitude.or.degree.of.change                                   object
Magnitude.or.degree.of.change..text.                            object
Response..Measurement                                          float64
Response.Number.of.Observations                                float64
Response.Standard.Deviation                                    float64
Control.Reference.Measurement                                  float64
Control.Number.of.Observations                                 float64
Control.Standard.Deviation                                     float64
general.area                                                    object
Location                                                        object
dtype: object
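
The dtypes above show that columns such as `Alpha` and `Years.since.Treatment..if.applicable.` were read as `object` (text) even though they hold numbers. The following is a minimal sketch of one way to coerce such columns, assuming unparseable entries should become missing values. The miniature `sample` frame and the helper name `to_numeric_columns` are illustrative assumptions and are not part of the original notebook.

```python
import pandas as pd

# Hypothetical miniature of the object-typed columns; values are illustrative only.
sample = pd.DataFrame({
    "Years.since.Treatment..if.applicable.": ["11", "4", "unknown", None],
    "Alpha": ["0.05", "0.1", None, "0.05"],
})


def to_numeric_columns(frame, columns):
    """Return a copy with the given columns coerced to numbers (unparseable -> NaN)."""
    out = frame.copy()
    for name in columns:
        out[name] = pd.to_numeric(out[name], errors="coerce")
    return out


converted = to_numeric_columns(sample, list(sample.columns))
print(converted.dtypes)
```

Coercing with `errors="coerce"` silently turns text such as "unknown" into NaN, so it is worth counting the missing values before and after the conversion.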

Filtering out non-relevant cases

In [6]:
df = forest[forest['Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.'].isin(['Plant'])]
In [7]:
df
Out[7]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Magnitude.or.degree.of.change Magnitude.or.degree.of.change..text. Response..Measurement Response.Number.of.Observations Response.Standard.Deviation Control.Reference.Measurement Control.Number.of.Observations Control.Standard.Deviation general.area Location
20 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... -23.63% Decrease 18.100000 84.0 3.600000 23.700000 56.0 6.200000 PNW Oregon
21 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... -36.29% Decrease 15.100000 72.0 1.600000 23.700000 56.0 6.200000 PNW Oregon
22 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... -5.91% Decrease 22.300000 24.0 8.200000 23.700000 56.0 6.200000 PNW Oregon
23 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... -35.86% Decrease 15.200000 28.0 4.700000 23.700000 56.0 6.200000 PNW Oregon
24 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 Y 0.05 community composition Plant ... -70.04% Decrease 7.100000 16.0 3.300000 23.700000 56.0 6.200000 PNW Oregon
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1936 FFS Data - Florida Coastal Plain Forest Prescribed Burn NaN NaN 4 NaN NaN community composition Plant ... #DIV/0! #DIV/0! 0.000000 1.0 NaN 0.000000 1.0 NaN SE FL
1937 FFS Data - Gulf Coastal Plain Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... -72.73% Decrease 0.010000 3.0 0.000000 0.036667 3.0 0.035119 SE AL
1938 FFS Data - Gulf Coastal Plain Forest Prescribed Burn NaN NaN 2 NaN NaN community composition Plant ... #DIV/0! #DIV/0! 0.003333 3.0 0.005774 0.000000 3.0 0.000000 SE AL
1939 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... 18.18% Increase 0.043333 3.0 0.035119 0.036667 3.0 0.035119 SE AL
1940 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... #DIV/0! #DIV/0! 0.023333 3.0 0.040415 0.000000 3.0 0.000000 SE AL

718 rows × 23 columns

In [8]:
a=['non-native' , 'alien species']

df = df[~df['Endpoint.Modifier'].isin(a)]
In [9]:
df
Out[9]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Magnitude.or.degree.of.change Magnitude.or.degree.of.change..text. Response..Measurement Response.Number.of.Observations Response.Standard.Deviation Control.Reference.Measurement Control.Number.of.Observations Control.Standard.Deviation general.area Location
20 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... -23.63% Decrease 18.100000 84.0 3.600000 23.700000 56.0 6.200000 PNW Oregon
21 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... -36.29% Decrease 15.100000 72.0 1.600000 23.700000 56.0 6.200000 PNW Oregon
22 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... -5.91% Decrease 22.300000 24.0 8.200000 23.700000 56.0 6.200000 PNW Oregon
23 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... -35.86% Decrease 15.200000 28.0 4.700000 23.700000 56.0 6.200000 PNW Oregon
24 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 Y 0.05 community composition Plant ... -70.04% Decrease 7.100000 16.0 3.300000 23.700000 56.0 6.200000 PNW Oregon
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1905 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... 20.91% Increase 2.313333 3.0 1.927287 1.913333 3.0 0.592818 SE AL
1906 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... 0.53% Increase 1.893333 3.0 0.015275 1.883333 3.0 0.496521 SE NC
1907 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 3 NaN NaN community composition Plant ... 16.25% Increase 2.170000 3.0 0.286182 1.866667 3.0 0.617441 SE NC
1908 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... -26.90% Decrease 1.476667 3.0 0.167730 2.020000 3.0 0.681689 SE NC
1909 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 3 NaN NaN community composition Plant ... -16.13% Decrease 1.560000 3.0 0.147986 1.860000 3.0 0.541479 SE NC

653 rows × 23 columns

In [10]:
df = df[df['Endpoint'].isin(['shannon diversity', 'simpson diversity', 'species richness'])]
In [11]:
df
Out[11]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Magnitude.or.degree.of.change Magnitude.or.degree.of.change..text. Response..Measurement Response.Number.of.Observations Response.Standard.Deviation Control.Reference.Measurement Control.Number.of.Observations Control.Standard.Deviation general.area Location
55 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 Y 0.05 community composition Plant ... 58.96% Increase 76.300000 84.0 3.500000 48.000000 56.0 4.900000 PNW Oregon
56 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 Y 0.05 community composition Plant ... 79.17% Increase 86.000000 72.0 13.600000 48.000000 56.0 4.900000 PNW Oregon
57 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... 11.88% Increase 53.700000 24.0 5.400000 48.000000 56.0 4.900000 PNW Oregon
58 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... 14.58% Increase 55.000000 28.0 3.500000 48.000000 56.0 4.900000 PNW Oregon
59 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 N 0.05 community composition Plant ... 24.38% Increase 59.700000 16.0 7.300000 48.000000 56.0 4.900000 PNW Oregon
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1905 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... 20.91% Increase 2.313333 3.0 1.927287 1.913333 3.0 0.592818 SE AL
1906 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... 0.53% Increase 1.893333 3.0 0.015275 1.883333 3.0 0.496521 SE NC
1907 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 3 NaN NaN community composition Plant ... 16.25% Increase 2.170000 3.0 0.286182 1.866667 3.0 0.617441 SE NC
1908 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... -26.90% Decrease 1.476667 3.0 0.167730 2.020000 3.0 0.681689 SE NC
1909 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 3 NaN NaN community composition Plant ... -16.13% Decrease 1.560000 3.0 0.147986 1.860000 3.0 0.541479 SE NC

192 rows × 23 columns
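
The three filters above (plant records only, no non-native or alien species, diversity endpoints only) can also be combined into one boolean mask. This is a sketch on a small made-up frame, not the notebook's data. The names `ELEMENT`, `excluded_modifiers`, `diversity_endpoints`, and `filtered` are assumptions chosen for illustration.

```python
import pandas as pd

ELEMENT = "Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna."

# Hypothetical rows; only the first one should pass all three filters.
sample = pd.DataFrame({
    ELEMENT: ["Plant", "Plant", "Soil", "Plant"],
    "Endpoint.Modifier": [None, "non-native", None, "early seral species"],
    "Endpoint": ["species richness", "species richness", "shannon diversity", "cover"],
})

excluded_modifiers = ["non-native", "alien species"]
diversity_endpoints = ["shannon diversity", "simpson diversity", "species richness"]

mask = (
    sample[ELEMENT].eq("Plant")                              # plant records only
    & ~sample["Endpoint.Modifier"].isin(excluded_modifiers)  # drop non-native/alien
    & sample["Endpoint"].isin(diversity_endpoints)           # diversity endpoints
)

# .copy() gives an independent frame, so adding columns later does not
# trigger pandas' SettingWithCopyWarning.
filtered = sample[mask].copy()
print(filtered)
```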

Removing missing values

In [12]:
# Getting rid of rows with missing data on the response variable

df = df[df['Response..Measurement'].notna()]
In [13]:
# Getting rid of rows with missing data on the response variable standard deviation

df = df[df['Response.Standard.Deviation'].notna()]
In [14]:
# Getting rid of rows with missing data on the response variable sample size

df = df[df['Response.Number.of.Observations'].notna()]
In [15]:
# Getting rid of rows with missing data on the control measurement

df = df[df['Control.Reference.Measurement'].notna()]
In [16]:
# Getting rid of rows with missing data on the control standard deviation

df = df[df['Control.Standard.Deviation'].notna()]
In [17]:
# Getting rid of rows with missing data on the control sample size

df = df[df['Control.Number.of.Observations'].notna()]
In [18]:
df
Out[18]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Magnitude.or.degree.of.change Magnitude.or.degree.of.change..text. Response..Measurement Response.Number.of.Observations Response.Standard.Deviation Control.Reference.Measurement Control.Number.of.Observations Control.Standard.Deviation general.area Location
55 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 Y 0.05 community composition Plant ... 58.96% Increase 76.300000 84.0 3.500000 48.000000 56.0 4.900000 PNW Oregon
56 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 Y 0.05 community composition Plant ... 79.17% Increase 86.000000 72.0 13.600000 48.000000 56.0 4.900000 PNW Oregon
57 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... 11.88% Increase 53.700000 24.0 5.400000 48.000000 56.0 4.900000 PNW Oregon
58 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... 14.58% Increase 55.000000 28.0 3.500000 48.000000 56.0 4.900000 PNW Oregon
59 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 N 0.05 community composition Plant ... 24.38% Increase 59.700000 16.0 7.300000 48.000000 56.0 4.900000 PNW Oregon
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1905 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... 20.91% Increase 2.313333 3.0 1.927287 1.913333 3.0 0.592818 SE AL
1906 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... 0.53% Increase 1.893333 3.0 0.015275 1.883333 3.0 0.496521 SE NC
1907 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 3 NaN NaN community composition Plant ... 16.25% Increase 2.170000 3.0 0.286182 1.866667 3.0 0.617441 SE NC
1908 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... -26.90% Decrease 1.476667 3.0 0.167730 2.020000 3.0 0.681689 SE NC
1909 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 3 NaN NaN community composition Plant ... -16.13% Decrease 1.560000 3.0 0.147986 1.860000 3.0 0.541479 SE NC

154 rows × 23 columns
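
The six missing-value filters above can be written as a single `dropna` call with a `subset` argument. The sketch below uses a made-up three-row frame to show the idea. The variable names `required` and `complete` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Columns that must all be present to compute an effect size.
required = [
    "Response..Measurement",
    "Response.Standard.Deviation",
    "Response.Number.of.Observations",
    "Control.Reference.Measurement",
    "Control.Standard.Deviation",
    "Control.Number.of.Observations",
]

# Hypothetical rows: the second and third each lack at least one required value.
sample = pd.DataFrame(
    [
        [76.3, 3.5, 84.0, 48.0, 4.9, 56.0],
        [0.0, np.nan, 1.0, 0.0, np.nan, 1.0],
        [2.17, 0.286, 3.0, 1.867, 0.617, np.nan],
    ],
    columns=required,
)

# Keep only rows where every required column is non-missing.
complete = sample.dropna(subset=required)
print(len(sample), "->", len(complete))
```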

Parsing the year out of the reference to remove studies from before 1996

In [19]:
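# Concatenate every run of digits found in the reference and store the result as the year
df['Year'] = df['Reference'].astype('str').str.extractall(r'(\d+)').unstack().fillna('').sum(axis=1).astype(int)
In [20]:
# References that yield no positive year (for example, those without digits) get the sentinel 9999,
# so they pass the year filter in the next cell
df['Year'] = df['Year'].astype(float).apply(lambda x: x if (x > 0) else 9999)
In [21]:
# Display the rows dated 1996 or later (the result is not assigned back to df)
df[df['Year'] > 1995]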
Out[21]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Magnitude.or.degree.of.change..text. Response..Measurement Response.Number.of.Observations Response.Standard.Deviation Control.Reference.Measurement Control.Number.of.Observations Control.Standard.Deviation general.area Location Year
55 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 Y 0.05 community composition Plant ... Increase 76.300000 84.0 3.500000 48.000000 56.0 4.900000 PNW Oregon 2010.0
56 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 Y 0.05 community composition Plant ... Increase 86.000000 72.0 13.600000 48.000000 56.0 4.900000 PNW Oregon 2010.0
57 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... Increase 53.700000 24.0 5.400000 48.000000 56.0 4.900000 PNW Oregon 2010.0
58 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... Increase 55.000000 28.0 3.500000 48.000000 56.0 4.900000 PNW Oregon 2010.0
59 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 N 0.05 community composition Plant ... Increase 59.700000 16.0 7.300000 48.000000 56.0 4.900000 PNW Oregon 2010.0
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1905 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... Increase 2.313333 3.0 1.927287 1.913333 3.0 0.592818 SE AL 9999.0
1906 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... Increase 1.893333 3.0 0.015275 1.883333 3.0 0.496521 SE NC 9999.0
1907 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 3 NaN NaN community composition Plant ... Increase 2.170000 3.0 0.286182 1.866667 3.0 0.617441 SE NC 9999.0
1908 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... Decrease 1.476667 3.0 0.167730 2.020000 3.0 0.681689 SE NC 9999.0
1909 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 3 NaN NaN community composition Plant ... Decrease 1.560000 3.0 0.147986 1.860000 3.0 0.541479 SE NC 9999.0

154 rows × 24 columns
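
The year parsing above joins every run of digits in a reference, which could produce an unintended number if a reference contained more than one number. A possible alternative, sketched here under the assumption that a publication year is a standalone four-digit number starting with 19 or 20, extracts only that pattern. The example references (including "Smith 1994") and the names `year` and `recent` are hypothetical. Unlike the notebook, this sketch keeps undated references by testing for missing years explicitly rather than using a 9999 sentinel.

```python
import pandas as pd

# Hypothetical references: one dated after 1995, one with no year, one dated before 1996.
refs = pd.Series(["Ares et al 2010", "FFS Data - Gulf Coastal Plain", "Smith 1994"])

# Capture a standalone four-digit year beginning with 19 or 20; no match gives NaN.
year_text = refs.str.extract(r"\b((?:19|20)\d{2})\b", expand=False)
year = pd.to_numeric(year_text, errors="coerce")

# Keep references from 1996 onward, plus those with no year at all.
recent = year.isna() | (year > 1995)
print(pd.DataFrame({"Reference": refs, "Year": year, "keep": recent}))
```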

In [22]:
# Create a function that we can re-use
# It prints summary statistics and draws a histogram with a boxplot; next time, just call show_distribution() instead of repeating this code
def show_distribution(var_data):
    from matplotlib import pyplot as plt

    # Get statistics
    min_val = var_data.min()
    max_val = var_data.max()
    mean_val = var_data.mean()
    med_val = var_data.median()
    mod_val = var_data.mode()[0]

    print('Minimum:{:.2f}\nMean:{:.2f}\nMedian:{:.2f}\nMode:{:.2f}\nMaximum:{:.2f}\n'.format(min_val,
                                                                                            mean_val,
                                                                                            med_val,
                                                                                            mod_val,
                                                                                            max_val))

    # Create a figure for 2 subplots (2 rows, 1 column)
    fig, ax = plt.subplots(2, 1, figsize = (10,4))

    # Plot the histogram   
    ax[0].hist(var_data)
    ax[0].set_ylabel('Frequency')

    # Add lines for the minimum, mean, median, mode, and maximum
    ax[0].axvline(x=min_val, color = 'gray', linestyle='dashed', linewidth = 2)
    ax[0].axvline(x=mean_val, color = 'cyan', linestyle='dashed', linewidth = 2)
    ax[0].axvline(x=med_val, color = 'red', linestyle='dashed', linewidth = 2)
    ax[0].axvline(x=mod_val, color = 'yellow', linestyle='dashed', linewidth = 2)
    ax[0].axvline(x=max_val, color = 'gray', linestyle='dashed', linewidth = 2)

    # Plot the boxplot   
    ax[1].boxplot(var_data, vert=False)
    ax[1].set_xlabel('Value')

    # Add a title to the Figure
    fig.suptitle('Data Distribution')

    # Show the figure (plt.show() works with the inline notebook backend, unlike fig.show())
    plt.show()


col = df['Response..Measurement']
# Call the function
show_distribution(col)
Minimum:-1.60
Mean:5.87
Median:2.06
Mode:3.60
Maximum:86.00

In [23]:
# Get the variable to examine
col = df['Control.Reference.Measurement']
# Call the function
show_distribution(col)
Minimum:-1.68
Mean:4.93
Median:1.86
Mode:0.70
Maximum:53.00

In [24]:
# Exploring relationships - this code will create scatter plots 

import matplotlib.pyplot as plt
from pandas.plotting import scatter_matrix
  
# selecting numerical features
features = ['Response..Measurement', 'Control.Reference.Measurement', 'Response.Standard.Deviation', 'Control.Standard.Deviation']
   
# plotting the scatter matrix
scatter_matrix(df[features], figsize=(12,12))
plt.xticks(rotation=90)
plt.show()
In [25]:
# If you want to compare groups based on one or more quantitative variables, use this code

col_list = ['Management.Method', 'Response..Measurement', 'Control.Reference.Measurement']
short_df = df[col_list]

import seaborn as sns

rs=1999
 
df_long = pd.melt(short_df.sample(150,random_state=rs), "Management.Method", var_name="Columns", value_name="Values")   
f,ax = plt.subplots(figsize=(16,8))
#plt.xticks(rotation=90) 
plt.ylim(0, 125) 
#plt.xlim(0, None) 
sns.boxplot(x="Columns", y="Values", hue="Management.Method", data=df_long)
In [26]:
df['ts2'] = df['Response.Standard.Deviation']**2
In [27]:
df['cs2'] = df['Control.Standard.Deviation']**2
In [28]:
df['term1'] = ((df['Response.Number.of.Observations'] - 1) * (df['ts2']))
In [29]:
df['term2'] = ((df['Control.Number.of.Observations'] - 1) * (df['cs2']))
In [30]:
df['numerator'] = df['term1'] + df['term2']
In [31]:
df['denominator'] = df['Response.Number.of.Observations'] + df['Control.Number.of.Observations'] - 2
In [32]:
df['divided'] = df['numerator'] / df['denominator']
In [33]:
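# Despite its name, pooledVar holds the pooled standard deviation (the square root of the pooled variance)
df['pooledVar'] = df['divided']**(1/2)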
In [34]:
df['target'] = ((df['Response..Measurement'] - df['Control.Reference.Measurement']) / (df['pooledVar']))
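
The cells above build the effect size step by step: the pooled variance is ((n_t - 1) * s_t^2 + (n_c - 1) * s_c^2) / (n_t + n_c - 2), its square root is the pooled standard deviation, and `target` is the difference between the response and control means divided by that value (a standardized mean difference). The same calculation can be sketched as one function. The function name and argument names below are illustrative assumptions. The input values are those shown for row 55 in the notebook output.

```python
import numpy as np


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Difference in means divided by the pooled standard deviation."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    pooled_sd = np.sqrt(pooled_var)
    return (mean_t - mean_c) / pooled_sd


# Row 55: response 76.3 (SD 3.5, n = 84) against control 48.0 (SD 4.9, n = 56).
d = standardized_mean_difference(76.3, 3.5, 84.0, 48.0, 4.9, 56.0)
print(round(float(d), 4))
```

The function also accepts whole pandas columns, because every operation in it is elementwise. Rows where both groups have a single observation, or where both standard deviations are zero, give an undefined or infinite result and need to be handled separately.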
In [35]:
# Getting rid of rows with missing data on the effect size (target)

df = df[df['target'].notna()]
In [36]:
df
Out[36]:
Reference Habitat.Type Management.Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Year ts2 cs2 term1 term2 numerator denominator divided pooledVar target
55 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 Y 0.05 community composition Plant ... 2010.0 12.250000 24.010000 1016.750000 1320.550000 2337.300000 138.0 16.936957 4.115453 6.876521
56 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 Y 0.05 community composition Plant ... 2010.0 184.960000 24.010000 13132.160000 1320.550000 14452.710000 126.0 114.704048 10.709998 3.548087
57 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... 2010.0 29.160000 24.010000 670.680000 1320.550000 1991.230000 78.0 25.528590 5.052582 1.128136
58 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... 2010.0 12.250000 24.010000 330.750000 1320.550000 1651.300000 82.0 20.137805 4.487517 1.559883
59 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 N 0.05 community composition Plant ... 2010.0 53.290000 24.010000 799.350000 1320.550000 2119.900000 70.0 30.284286 5.503116 2.126068
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1905 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... 9999.0 3.714433 0.351433 7.428867 0.702867 8.131733 4.0 2.032933 1.425810 0.280542
1906 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... 9999.0 0.000233 0.246533 0.000467 0.493067 0.493533 4.0 0.123383 0.351260 0.028469
1907 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 3 NaN NaN community composition Plant ... 9999.0 0.081900 0.381233 0.163800 0.762467 0.926267 4.0 0.231567 0.481214 0.630351
1908 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... 9999.0 0.028133 0.464700 0.056267 0.929400 0.985667 4.0 0.246417 0.496404 -1.094539
1909 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 3 NaN NaN community composition Plant ... 9999.0 0.021900 0.293200 0.043800 0.586400 0.630200 4.0 0.157550 0.396926 -0.755809

154 rows × 33 columns

In [37]:
# If you want to compare groups based on one or more quantitative variables, use this code

col_list = ['Management.Method', 'target']
short_df = df[col_list]

import seaborn as sns

rs=1999
 
df_long = pd.melt(short_df.sample(150,random_state=rs), "Management.Method", var_name="Management Method", value_name="Effect Size")   
f,ax = plt.subplots(figsize=(16,8))
#plt.xticks(rotation=90) 
plt.ylim(-10, 10) 
#plt.xlim(0, None) 
sns.boxplot(x="Management Method", y="Effect Size", hue="Management.Method", data=df_long)
In [38]:
# If you want to compare groups based on one or more quantitative variables, use this code

col_list = ['Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.', 'target']
short_df = df[col_list]

import seaborn as sns

rs=1999
 
df_long = pd.melt(short_df.sample(150,random_state=rs), "Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.", var_name="Columns", value_name="Values")   
f,ax = plt.subplots(figsize=(16,8))
#plt.xticks(rotation=90) 
plt.ylim(-10, 10) 
#plt.xlim(0, None) 
sns.boxplot(x="Columns", y="Values", hue="Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.", data=df_long)
In [39]:
#Keeping only the columns we want

cols_to_keep = ['Reference',
 'Habitat.Type',
 'Management.Method',
 'Management.Effect.Type',
 'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
 'Endpoint',
 'Measurement.Unit',
 'Endpoint.Modifier',
 'Magnitude.or.degree.of.change',
 'Magnitude.or.degree.of.change..text.',
 'Response..Measurement',
 'Response.Number.of.Observations',
 'Response.Standard.Deviation',
 'Control.Reference.Measurement',
 'Control.Number.of.Observations',
 'Control.Standard.Deviation',
 'general.area',
 'Location',
 'Year',
 'target']
df2=df[cols_to_keep]
In [40]:
# Create dummy variables for the management method

combine = pd.concat([df2,pd.get_dummies(df['Management.Method'], prefix='ID')],axis=1)
combine.columns
Out[40]:
Index(['Reference', 'Habitat.Type', 'Management.Method',
       'Management.Effect.Type',
       'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
       'Endpoint', 'Measurement.Unit', 'Endpoint.Modifier',
       'Magnitude.or.degree.of.change', 'Magnitude.or.degree.of.change..text.',
       'Response..Measurement', 'Response.Number.of.Observations',
       'Response.Standard.Deviation', 'Control.Reference.Measurement',
       'Control.Number.of.Observations', 'Control.Standard.Deviation',
       'general.area', 'Location', 'Year', 'target', 'ID_Carbon Addition',
       'ID_Prescribed Burn', 'ID_Thinning'],
      dtype='object')
In [41]:
# Create dummy variables for the habitat type

combine2 = pd.concat([combine,pd.get_dummies(combine['Habitat.Type'], prefix='ID')],axis=1)
combine2.columns
Out[41]:
Index(['Reference', 'Habitat.Type', 'Management.Method',
       'Management.Effect.Type',
       'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
       'Endpoint', 'Measurement.Unit', 'Endpoint.Modifier',
       'Magnitude.or.degree.of.change', 'Magnitude.or.degree.of.change..text.',
       'Response..Measurement', 'Response.Number.of.Observations',
       'Response.Standard.Deviation', 'Control.Reference.Measurement',
       'Control.Number.of.Observations', 'Control.Standard.Deviation',
       'general.area', 'Location', 'Year', 'target', 'ID_Carbon Addition',
       'ID_Prescribed Burn', 'ID_Thinning', 'ID_Coniferous Forest',
       'ID_Deciduous Forest', 'ID_Forest'],
      dtype='object')
In [42]:
combine3 = pd.concat([combine2,pd.get_dummies(combine2['Management.Effect.Type'], prefix='ID')],axis=1)
combine3.columns
Out[42]:
Index(['Reference', 'Habitat.Type', 'Management.Method',
       'Management.Effect.Type',
       'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
       'Endpoint', 'Measurement.Unit', 'Endpoint.Modifier',
       'Magnitude.or.degree.of.change', 'Magnitude.or.degree.of.change..text.',
       'Response..Measurement', 'Response.Number.of.Observations',
       'Response.Standard.Deviation', 'Control.Reference.Measurement',
       'Control.Number.of.Observations', 'Control.Standard.Deviation',
       'general.area', 'Location', 'Year', 'target', 'ID_Carbon Addition',
       'ID_Prescribed Burn', 'ID_Thinning', 'ID_Coniferous Forest',
       'ID_Deciduous Forest', 'ID_Forest', 'ID_community composition'],
      dtype='object')
In [43]:
combine4 = pd.concat([combine3,pd.get_dummies(combine3['Location'], prefix='ID')],axis=1)
combine4.columns
Out[43]:
Index(['Reference', 'Habitat.Type', 'Management.Method',
       'Management.Effect.Type',
       'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
       'Endpoint', 'Measurement.Unit', 'Endpoint.Modifier',
       'Magnitude.or.degree.of.change', 'Magnitude.or.degree.of.change..text.',
       'Response..Measurement', 'Response.Number.of.Observations',
       'Response.Standard.Deviation', 'Control.Reference.Measurement',
       'Control.Number.of.Observations', 'Control.Standard.Deviation',
       'general.area', 'Location', 'Year', 'target', 'ID_Carbon Addition',
       'ID_Prescribed Burn', 'ID_Thinning', 'ID_Coniferous Forest',
       'ID_Deciduous Forest', 'ID_Forest', 'ID_community composition',
       'ID_ AL', 'ID_ AZ', 'ID_ CA', 'ID_ MT', 'ID_ NC', 'ID_ SC',
       'ID_Colorado', 'ID_Estonia', 'ID_FL', 'ID_OH', 'ID_Oregon', 'ID_WA'],
      dtype='object')
In [44]:
combine5 = pd.concat([combine4,pd.get_dummies(combine4['Endpoint'], prefix='ID')],axis=1)
combine5.columns
Out[44]:
Index(['Reference', 'Habitat.Type', 'Management.Method',
       'Management.Effect.Type',
       'Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna.',
       'Endpoint', 'Measurement.Unit', 'Endpoint.Modifier',
       'Magnitude.or.degree.of.change', 'Magnitude.or.degree.of.change..text.',
       'Response..Measurement', 'Response.Number.of.Observations',
       'Response.Standard.Deviation', 'Control.Reference.Measurement',
       'Control.Number.of.Observations', 'Control.Standard.Deviation',
       'general.area', 'Location', 'Year', 'target', 'ID_Carbon Addition',
       'ID_Prescribed Burn', 'ID_Thinning', 'ID_Coniferous Forest',
       'ID_Deciduous Forest', 'ID_Forest', 'ID_community composition',
       'ID_ AL', 'ID_ AZ', 'ID_ CA', 'ID_ MT', 'ID_ NC', 'ID_ SC',
       'ID_Colorado', 'ID_Estonia', 'ID_FL', 'ID_OH', 'ID_Oregon', 'ID_WA',
       'ID_shannon diversity', 'ID_simpson diversity', 'ID_species richness'],
      dtype='object')
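
The two cells above repeat the same concat-plus-get_dummies pattern once per categorical column. A minimal sketch of how that pattern could be wrapped in a loop is shown below. The helper name `add_dummies` and the toy data are hypothetical, not part of the notebook. The sketch also strips surrounding whitespace before encoding, because the column list above contains names such as 'ID_ AL' and 'ID_ AZ', which suggests that some Location values carry a leading space.

```python
import pandas as pd

def add_dummies(frame, columns, prefix="ID"):
    """Append 0/1 indicator columns for each listed categorical column.

    Hypothetical helper: mirrors the notebook's pd.concat + pd.get_dummies
    pattern, with one extra step that strips whitespace from the categories.
    """
    pieces = [frame]
    for col in columns:
        cleaned = frame[col].str.strip()  # ' AL' becomes 'AL'
        pieces.append(pd.get_dummies(cleaned, prefix=prefix, dtype=int))
    return pd.concat(pieces, axis=1)

# Toy data standing in for combine3 (values are illustrative only)
toy = pd.DataFrame({
    "Location": [" AL", "Oregon", " AL"],
    "Endpoint": ["species richness", "shannon diversity", "species richness"],
})

encoded = add_dummies(toy, ["Location", "Endpoint"])
print(encoded.columns.tolist())
```

Because every column shares the 'ID' prefix, two source columns that contain the same category label would produce duplicate column names; using a per-column prefix would avoid that.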

Decision trees

In [45]:
combine5.head(10)
Out[45]:
Reference Habitat.Type Management.Method Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. Endpoint Measurement.Unit Endpoint.Modifier Magnitude.or.degree.of.change Magnitude.or.degree.of.change..text. ... ID_ SC ID_Colorado ID_Estonia ID_FL ID_OH ID_Oregon ID_WA ID_shannon diversity ID_simpson diversity ID_species richness
55 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species NaN 58.96% Increase ... 0 0 0 0 0 1 0 0 0 1
56 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species NaN 79.17% Increase ... 0 0 0 0 0 1 0 0 0 1
57 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species NaN 11.88% Increase ... 0 0 0 0 0 1 0 0 0 1
58 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species NaN 14.58% Increase ... 0 0 0 0 0 1 0 0 0 1
59 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species NaN 24.38% Increase ... 0 0 0 0 0 1 0 0 0 1
65 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species early seral species 135.90% Increase ... 0 0 0 0 0 1 0 0 0 1
66 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species early seral species 115.38% Increase ... 0 0 0 0 0 1 0 0 0 1
67 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species early seral species 89.74% Increase ... 0 0 0 0 0 1 0 0 0 1
68 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species early seral species 164.10% Increase ... 0 0 0 0 0 1 0 0 0 1
69 Ares et al 2010 Coniferous Forest Thinning community composition Plant species richness number of species early seral species 123.08% Increase ... 0 0 0 0 0 1 0 0 0 1

10 rows × 42 columns

In [46]:
#conda install graphviz
In [47]:
#conda update -n base -c defaults conda
In [48]:
# Create a list of features for the decision tree
# Include in this list all the predictor variables that you want to use in your decision tree

features=['ID_Carbon Addition',
 'ID_Prescribed Burn', 
 'ID_Thinning',
 'ID_Coniferous Forest', 
 'ID_Deciduous Forest', 
 'ID_Forest',
 'ID_community composition',
 'ID_shannon diversity', 
 'ID_simpson diversity', 
 'ID_species richness', 
 ]
In [49]:
target=['target']
In [50]:
# This block of code runs the decision tree analysis
# Replace "target" with the name of the target (dependent) variable of interest
# max_depth and max_leaf_nodes are hyperparameters set by the researcher; they limit the size of the tree

from sklearn.tree import DecisionTreeRegressor
from sklearn import tree

y = combine5["target"]
X = combine5[features]
dt = DecisionTreeRegressor(min_samples_split=10, min_samples_leaf=10, random_state=99, max_depth=5, max_leaf_nodes=7)
dt.fit(X, y)
Out[50]:
DecisionTreeRegressor(criterion='mse', max_depth=5, max_features=None,
                      max_leaf_nodes=7, min_impurity_decrease=0.0,
                      min_impurity_split=None, min_samples_leaf=10,
                      min_samples_split=10, min_weight_fraction_leaf=0.0,
                      presort=False, random_state=99, splitter='best')
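
To illustrate how max_depth and max_leaf_nodes limit the size of the tree, the sketch below fits the same kind of regressor on synthetic 0/1 predictors, once with the notebook's limits and once without them. The data are randomly generated for illustration only and have no connection to the forest dataset; `get_n_leaves()` and `get_depth()` report the size of each fitted tree.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in data: ten 0/1 indicator columns and a noisy target
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(400, 10))
y = 2.0 * X[:, 3] + 0.5 * X[:, 9] - 0.3 * X[:, 2] + rng.normal(0, 0.1, 400)

# Same hyperparameters as the notebook cell above
capped = DecisionTreeRegressor(min_samples_split=10, min_samples_leaf=10,
                               random_state=99, max_depth=5, max_leaf_nodes=7)
capped.fit(X, y)

# Same leaf-size rule, but no cap on depth or on the number of leaves
loose = DecisionTreeRegressor(min_samples_leaf=10, random_state=99)
loose.fit(X, y)

print("capped:", capped.get_n_leaves(), "leaves, depth", capped.get_depth())
print("loose: ", loose.get_n_leaves(), "leaves, depth", loose.get_depth())
```

The capped tree can never exceed 7 leaves or 5 levels, whichever limit is reached first.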
In [51]:
text_representation = tree.export_text(dt)
print(text_representation)
|--- feature_3 <= 0.50
|   |--- feature_9 <= 0.50
|   |   |--- feature_2 <= 0.50
|   |   |   |--- feature_7 <= 0.50
|   |   |   |   |--- value: [-0.50]
|   |   |   |--- feature_7 >  0.50
|   |   |   |   |--- value: [-0.32]
|   |   |--- feature_2 >  0.50
|   |   |   |--- feature_7 <= 0.50
|   |   |   |   |--- value: [-0.30]
|   |   |   |--- feature_7 >  0.50
|   |   |   |   |--- value: [-0.19]
|   |--- feature_9 >  0.50
|   |   |--- feature_2 <= 0.50
|   |   |   |--- value: [0.33]
|   |   |--- feature_2 >  0.50
|   |   |   |--- value: [0.45]
|--- feature_3 >  0.50
|   |--- value: [2.45]
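
The text output above labels splits by position (feature_3, feature_9, and so on), where the number is the zero-based index into the `features` list. A small sketch for translating those labels back to column names follows; the helper `label_features` is hypothetical. Passing `feature_names=features` to `tree.export_text` should give the same result directly.

```python
import re

# Same order as the features list defined earlier in the notebook
features = ['ID_Carbon Addition',
            'ID_Prescribed Burn',
            'ID_Thinning',
            'ID_Coniferous Forest',
            'ID_Deciduous Forest',
            'ID_Forest',
            'ID_community composition',
            'ID_shannon diversity',
            'ID_simpson diversity',
            'ID_species richness']

def label_features(tree_text, names):
    """Replace each 'feature_N' token with the N-th name (zero-based)."""
    return re.sub(r"feature_(\d+)", lambda m: names[int(m.group(1))], tree_text)

# A few lines copied from the output above
sample = ("|--- feature_3 <= 0.50\n"
          "|   |--- feature_9 <= 0.50\n"
          "|   |   |--- feature_2 <= 0.50\n"
          "|   |   |   |--- feature_7 <= 0.50")
print(label_features(sample, features))
```

Read this way, the first split is on 'ID_Coniferous Forest', followed by 'ID_species richness', 'ID_Thinning', and 'ID_shannon diversity'.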

In [52]:
with open("decision_tree.log", "w") as fout:
    fout.write(text_representation)
In [53]:
# Plot the fitted regression tree, labeling each split with its predictor name
fig = plt.figure(figsize=(25,20))
_ = tree.plot_tree(dt, 
                   feature_names=features,  
                   filled=True)
In [54]:
# Use this code to compare groups on one or more quantitative variables

col_list = ['Habitat.Type', 'target']
short_df = df[col_list]

rs=1999
 
# Reshape to long format: one row per (group, variable, value)
df_long = pd.melt(short_df.sample(150,random_state=rs), "Habitat.Type", var_name="Variable", value_name="Effect Size")   
f,ax = plt.subplots(figsize=(16,8))
#plt.xticks(rotation=90) 
plt.ylim(-10, 10) 
#plt.xlim(0, None) 
sns.boxplot(x="Variable", y="Effect Size", hue="Habitat.Type", data=df_long)
Out[54]:
<matplotlib.axes._subplots.AxesSubplot at 0x7fd556345090>
In [55]:
#conda install -c plotly plotly=5.10.0
In [56]:
# selecting rows based on condition
richness = df[df['Endpoint'] == 'species richness']
In [57]:
# Assign the renamed copy instead of using inplace=True on a slice of df (avoids SettingWithCopyWarning)
richness = richness.rename(columns = {'target': 'Effect Size', 'Management.Method':'Management Method', 'Habitat.Type':'Habitat Type'})
richness
Out[57]:
Reference Habitat Type Management Method Management.Intensity Duration.of.treatment..years. Years.since.Treatment..if.applicable. Significant. Alpha Management.Effect.Type Ecosystem.element.assessed..soil.plant.microbe.fungi.fauna. ... Year ts2 cs2 term1 term2 numerator denominator divided pooledVar Effect Size
55 Ares et al 2010 Coniferous Forest Thinning Thinning to 300 trees/ha NaN 11 Y 0.05 community composition Plant ... 2010.0 12.250000 24.010000 1016.750000 1320.550000 2337.300000 138.0 16.936957 4.115453 6.876521
56 Ares et al 2010 Coniferous Forest Thinning Thinning to 200 trees/ha NaN 11 Y 0.05 community composition Plant ... 2010.0 184.960000 24.010000 13132.160000 1320.550000 14452.710000 126.0 114.704048 10.709998 3.548087
57 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 300 trees/ha NaN 11 N 0.05 community composition Plant ... 2010.0 29.160000 24.010000 670.680000 1320.550000 1991.230000 78.0 25.528590 5.052582 1.128136
58 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 200 trees/ha NaN 11 N 0.05 community composition Plant ... 2010.0 12.250000 24.010000 330.750000 1320.550000 1651.300000 82.0 20.137805 4.487517 1.559883
59 Ares et al 2010 Coniferous Forest Thinning Variable Thinning to 100 trees/ha NaN 11 N 0.05 community composition Plant ... 2010.0 53.290000 24.010000 799.350000 1320.550000 2119.900000 70.0 30.284286 5.503116 2.126068
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
1905 FFS Data - Gulf Coastal Plain Forest Thinning NaN NaN 2 NaN NaN community composition Plant ... 9999.0 3.714433 0.351433 7.428867 0.702867 8.131733 4.0 2.032933 1.425810 0.280542
1906 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 1 NaN NaN community composition Plant ... 9999.0 0.000233 0.246533 0.000467 0.493067 0.493533 4.0 0.123383 0.351260 0.028469
1907 FFS Data - Southern Appalachian Mts. Forest Prescribed Burn NaN NaN 3 NaN NaN community composition Plant ... 9999.0 0.081900 0.381233 0.163800 0.762467 0.926267 4.0 0.231567 0.481214 0.630351
1908 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 1 NaN NaN community composition Plant ... 9999.0 0.028133 0.464700 0.056267 0.929400 0.985667 4.0 0.246417 0.496404 -1.094539
1909 FFS Data - Southern Appalachian Mts. Forest Thinning NaN NaN 3 NaN NaN community composition Plant ... 9999.0 0.021900 0.293200 0.043800 0.586400 0.630200 4.0 0.157550 0.396926 -0.755809

95 rows × 33 columns
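
The intermediate columns in the table above (ts2, cs2, term1, term2, numerator, denominator, divided, pooledVar) are consistent with a pooled standard deviation of the kind used for a standardized mean difference. The sketch below reproduces that arithmetic for row 55. The sample sizes 84 and 56 are inferred from the table (term1 / ts2 = 83, term2 / cs2 = 55, denominator = 138), and the final effect-size formula is an assumption, since the cell that computes it is not shown in this part of the notebook.

```python
import math

def pooled_sd(n_treat, var_treat, n_ctrl, var_ctrl):
    """Pooled standard deviation, following the table's column names."""
    term1 = (n_treat - 1) * var_treat      # 'term1'
    term2 = (n_ctrl - 1) * var_ctrl        # 'term2'
    numerator = term1 + term2              # 'numerator'
    denominator = n_treat + n_ctrl - 2     # 'denominator'
    divided = numerator / denominator      # 'divided'
    return math.sqrt(divided)              # 'pooledVar' (a standard deviation)

def standardized_difference(mean_treat, mean_ctrl, sd_pooled):
    """Assumed effect-size form: mean difference divided by the pooled SD."""
    return (mean_treat - mean_ctrl) / sd_pooled

# Row 55: ts2 = 12.25, cs2 = 24.01, with inferred sample sizes 84 and 56
sd = pooled_sd(84, 12.25, 56, 24.01)
print(round(sd, 6))  # compare with pooledVar = 4.115453 in the table
```

Despite its name, the pooledVar column holds the square root of the pooled variance, so it is on the same scale as the measurements.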

In [61]:
import plotly.express as px
In [62]:
px.box(richness, x="Effect Size", y="Management Method", orientation="h", color="Habitat Type", notched=True,
      category_orders={"Management Method": ["Carbon Addition", "Prescribed Burn", "Thinning"]})
In [63]:
# selecting rows based on condition
shannon = df[df['Endpoint'] == 'shannon diversity']
In [64]:
# Assign the renamed copy instead of using inplace=True on a slice of df (avoids SettingWithCopyWarning)
shannon = shannon.rename(columns = {'target': 'Effect Size', 'Management.Method':'Management Method', 'Habitat.Type':'Habitat Type'})

In [65]:
px.box(shannon, x="Effect Size", y="Management Method", orientation="h", color="Habitat Type", notched=True,
      category_orders={"Management Method": ["Carbon Addition", "Prescribed Burn", "Thinning"]})
In [66]:
# selecting rows based on condition
simpson = df[df['Endpoint'] == 'simpson diversity']
In [67]:
# Assign the renamed copy instead of using inplace=True on a slice of df (avoids SettingWithCopyWarning)
simpson = simpson.rename(columns = {'target': 'Effect Size', 'Management.Method':'Management Method', 'Habitat.Type':'Habitat Type'})

In [68]:
px.box(simpson, x="Effect Size", y="Management Method", orientation="h", color="Habitat Type", notched=True,
      category_orders={"Management Method": ["Carbon Addition", "Prescribed Burn", "Thinning"]})